Abstract

Complex networks have been studied extensively due to their relevance to many real systems
as diverse as the World-Wide-Web (WWW), the Internet, energy landscapes, biological and social
networks. A large number of real networks are called "scale-free" because they show a
power-law distribution of the number of links per node. However, it is widely believed that
complex networks are not length-scale invariant or self-similar. This conclusion originates from
the "small-world" property of these networks, which implies that the number of nodes increases
exponentially with the "diameter" of the network [8, 9, 10, 11], rather than the power-law relation
expected for a self-similar structure. Nevertheless, here we present a novel approach to the analysis 
of such networks, revealing that their structure is indeed self-similar. This result is achieved by the 
application of a renormalization procedure which coarse-grains the system into boxes containing 
nodes within a given "size". Concurrently, we identify a power-law relation between the number 
of boxes needed to cover the network and the size of the box defining a finite self-similar exponent. 
These fundamental properties, which are shown for the WWW, social, cellular and protein-protein 
interaction networks, help to understand the emergence of the scale-free property in complex 
networks. They suggest a common self-organization dynamics of diverse networks at different 
scales into a critical state and in turn bring together previously unrelated fields: the statistical 
physics of complex networks with renormalization group, fractals and critical phenomena. 






Two fundamental properties of real complex networks have attracted much attention 
recently: the small-world and the scale-free properties. Many naturally occurring networks 
are small world since one can reach a given node from another one, following the path 
with the smallest number of links between the nodes, in a very small number of steps. 
This corresponds to the so-called "six degrees of separation" in social networks. It
is mathematically expressed by the slow (logarithmic) increase of the average diameter
of the network, $\bar{\ell}$, with the total number of nodes $N$, $\bar{\ell} \sim \ln N$, where $\ell$ is the shortest
distance between two nodes and defines the distance metric in complex networks.
Equivalently, we obtain:

$$N \sim e^{\bar{\ell}/\ell_0}, \qquad (1)$$

where $\ell_0$ is a characteristic length.

A second fundamental property in the study of complex networks arises with the discovery 
that the probability distribution of the number of links per node, $P(k)$ (also known as the
degree distribution), can be represented by a power-law (scale-free) with a degree exponent
$\gamma$ usually in the range $2 < \gamma < 3$,

$$P(k) \sim k^{-\gamma}. \qquad (2)$$

These discoveries have been confirmed in many empirical studies of diverse networks.

With the aim of providing a deeper understanding of the underlying mechanism which 
leads to these common features one needs to probe the patterns within the network structure 
in more detail. The question of connectivity between groups of interconnected nodes on 
different length-scales has received less attention. Yet, a plethora of examples in Nature 
exhibits the importance of collective behavior, from interactions between communities within 
social networks, links between clusters of web-sites of similar subjects, all the way to the 
highly modular manner in which molecules interact to keep a cell alive. Here we show that 
real complex networks, such as WWW, social, protein-protein interaction networks (PIN) 
and cellular networks are indeed constructed of self-repeating patterns on all length-scales, 
and are therefore invariant or self-similar under a length-scale transformation. 

This result comes as a surprise since the exponential increase in Eq. (1) has led to
the general understanding that complex networks are not self-similar, since self-similarity
requires a power-law relation between $N$ and $\bar{\ell}$.

How can one reconcile the exponential increase in Eq. (1) with self-similarity, or in
other words an underlying length-scale-invariant topology? At the root of the self-similar
properties that we unravel in this study is a scale-invariant renormalization procedure which
we show to be valid for dissimilar complex networks.

In order to demonstrate this concept we first consider a self-similar network embedded 
in Euclidean space, of which a classical example would be a fractal percolation cluster at 
criticality. In order to unfold the self-similar properties of such clusters we calculate the
fractal dimension using a "box counting" method and a "cluster growing" method.

In the first method we cover the percolation cluster with $N_B$ boxes of linear size $\ell_B$. The
fractal dimension or box dimension $d_B$ is then given by:

$$N_B \sim \ell_B^{-d_B}. \qquad (3)$$

In the second method, the network is not covered with boxes. Instead, one seed node is
chosen at random and a cluster of nodes centered at the seed and separated by a minimum
distance $\ell$ is calculated. The procedure is then repeated by choosing many seed nodes at
random and the average "mass" of the resulting clusters ($\langle M_c \rangle$, defined as the number of
nodes in the cluster) is calculated as a function of $\ell$ to obtain the following scaling:

$$\langle M_c \rangle \sim \ell^{d_f}, \qquad (4)$$

defining the fractal cluster dimension $d_f$ [14]. Comparing Eqs. (4) and (1) implies that
$d_f = \infty$ for complex small-world networks.
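
As a rough illustration of the cluster growing measurement behind Eq. (4), the sketch below averages the number of nodes found within $\ell$ links of randomly chosen seeds. It assumes the network is stored as an adjacency dictionary and interprets a cluster as all nodes at shortest-path distance at most $\ell$ from the seed; the function names and the ring-shaped test network are illustrative choices, not part of the original analysis.

```python
import random
from collections import deque


def bfs_distances(adj, source, max_dist=None):
    """Return {node: hop distance from source}, optionally truncated at max_dist."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if max_dist is not None and dist[u] >= max_dist:
            continue  # do not expand beyond the requested radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_cluster_mass(adj, ell, n_seeds=100, seed=0):
    """Average number of nodes within distance ell of a randomly chosen seed node."""
    rng = random.Random(seed)
    nodes = list(adj)
    total = 0
    for _ in range(n_seeds):
        start = rng.choice(nodes)
        total += len(bfs_distances(adj, start, max_dist=ell))
    return total / n_seeds


# Hypothetical test network: a ring of 20 nodes, where the mass is 2*ell + 1.
ring = {i: [(i - 1) % 20, (i + 1) % 20] for i in range(20)}
masses = {ell: mean_cluster_mass(ring, ell) for ell in range(0, 5)}
```

On the ring the mass grows linearly with $\ell$, i.e. $d_f = 1$; on a small-world network the same measurement would instead grow roughly exponentially until the whole network is reached.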

For a homogeneous network characterized by a narrow degree distribution (such as a
fractal percolation cluster) the box covering method of Eq. (3) and the cluster growing
method of Eq. (4) are equivalent since every node typically has the same number of links
or neighbors. Equation (4) can then be derived from Eq. (3) with $d_B = d_f$, and this relation has
been regularly used.

The crux of the matter is to understand how one can calculate a self-similar exponent
(analogous to the fractal dimension in Euclidean space) in complex inhomogeneous networks
with a broad degree distribution such as Eq. (2). Under these conditions Eqs. (3) and (4)
are not equivalent, as will be shown below. The application of the proper covering procedure
in the box counting method, Eq. (3), for complex networks unveils a set of self-similar
properties such as a finite self-similar exponent and a new set of critical exponents for the
scale-invariant topology.

Figure 1a illustrates the box covering method using a schematic network composed of 8
nodes. For each value of the box size $\ell_B$ we search for the number of boxes needed to tile
the entire network such that each box contains nodes separated by a distance $\ell < \ell_B$.
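
The tiling rule above can be sketched with a simple greedy covering. This is only one possible heuristic, assumed here for illustration: it does not search for the minimum number of boxes, and the function names and the path-shaped test network are hypothetical.

```python
from collections import deque


def all_pairs_distances(adj):
    """Shortest-path (hop) distances between all pairs, via one BFS per node."""
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[source] = d
    return dist


def greedy_box_cover(adj, ell_b):
    """Tile the network with boxes whose nodes are pairwise at distance < ell_b."""
    dist = all_pairs_distances(adj)
    inf = float("inf")
    uncovered = set(adj)
    boxes = []
    for node in sorted(adj):
        if node not in uncovered:
            continue  # each node is covered exactly once
        box = [node]
        uncovered.discard(node)
        for cand in sorted(uncovered):
            # A candidate joins only if it is close enough to every box member.
            if all(dist[cand].get(member, inf) < ell_b for member in box):
                box.append(cand)
        uncovered.difference_update(box)
        boxes.append(box)
    return boxes


# Hypothetical test network: a path of 6 nodes, 0-1-2-3-4-5.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
boxes_by_size = {ell_b: greedy_box_cover(path, ell_b) for ell_b in (1, 2, 3)}
```

Counting `len(greedy_box_cover(adj, ell_b))` for a range of `ell_b` gives an estimate of $N_B(\ell_B)$; different node orderings generally give different tilings.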

This procedure is applied to several different real networks: (i) a part of the WWW
composed of 325,729 web-pages which are connected if there is a URL link from one page
to another (http://www.nd.edu/~networks), (ii) a social network where the nodes are
392,340 actors linked if they were cast together in at least one movie [15], (iii) the biological
networks of protein-protein interactions found in E. coli (429 proteins) and H. sapiens
(human) (946 proteins) linked if there is a physical binding between them (database available
via the Database of Interacting Proteins [16,17]; other PINs are discussed in the Supplementary
Materials), and (iv) the cellular networks compiled by [18] using a graph-theoretical
representation of the whole biochemical pathways based on the WIT integrated-pathway
genome database (http://igweb.integratedgenomics.com/IGwit) of 43 species from Archaea,
Bacteria, and Eukarya. Here we show the results for A. fulgidus, E. coli and C.
elegans [18], while the full database is analyzed in the Supplementary Materials. It has been
previously determined that the WWW and actors networks are small-world and scale-free,
characterized by Eq. (2) with $\gamma = 2.6$ and $2.2$, respectively. For the PINs of E. coli and
H. sapiens we find $\gamma = 2.2$ and $2.1$, respectively. All cellular networks are scale-free with
average exponent $\gamma = 2.2$. We confirm these values and show the results for the WWW
in Fig. 2d.

Figures 2a and 2b show the results of $N_B(\ell_B)$ according to Eq. (3). They reveal the
existence of self-similarity in the WWW, actors, and E. coli and H. sapiens protein-protein
interaction networks with self-similar exponents $d_B = 4.1$, $d_B = 6.3$, and $d_B = 2.3$ and
$d_B = 2.3$, respectively. The cellular networks are shown in Fig. 2c and have $d_B = 3.5$.
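
One common way to read an exponent such as $d_B$ off a log-log plot is a least-squares fit of $\log N_B$ against $\log \ell_B$. The sketch below assumes this fitting choice (the text does not state which procedure was used) and runs on synthetic data generated with a known exponent, not on the measured networks.

```python
import math


def fit_power_law_exponent(ell_values, n_boxes):
    """Least-squares slope of log N_B versus log ell_B; returns d_B = -slope."""
    xs = [math.log(ell) for ell in ell_values]
    ys = [math.log(n) for n in n_boxes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic data following N_B = 1000 * ell_B^(-2.5) exactly (illustrative only).
ells = [2, 3, 4, 5, 6, 8, 10]
counts = [1000.0 * ell ** -2.5 for ell in ells]
d_b_estimate = fit_power_law_exponent(ells, counts)
```

With real data the fit range matters: points at very small or very large $\ell_B$ may deviate from the power law because of finite-size effects.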

We now elaborate on the apparent contradiction between the two definitions of self-similar
exponents in complex networks. After performing a renormalization at a given $\ell_B$
we calculate the mean mass of the boxes covering the network, $\langle M_B(\ell_B) \rangle$, to obtain

$$\langle M_B(\ell_B) \rangle \equiv N/N_B(\ell_B) \sim \ell_B^{d_B}, \qquad (5)$$

which is corroborated by direct measurements for all the networks and shown in Fig. 3a for
the WWW.

On the other hand, the average performed in the cluster growing method (for this calculation
we average over single boxes without tiling the system) gives rise to an exponential
growth of the mass

$$\langle M_c(\ell) \rangle \sim e^{\ell/\ell_1}, \qquad (6)$$

with $\ell_1 \approx 0.78$ in accordance with the small-world effect, Eq. (1), as seen in Fig. 3a.

The topology of scale-free networks is dominated by several highly connected hubs (the
nodes with the largest degree), implying that most of the nodes are connected to the hubs
via one or very few steps. Therefore the average performed in the cluster growing method
is biased; the hubs are overrepresented in Eq. (6) since almost every node is a neighbor of
a hub. By choosing the seed of the clusters at random, there is a very large probability of
including the hubs in the clusters. On the other hand, the box covering method is a global
tiling of the system providing a flat average over all the nodes, i.e. each part of the network
is covered with an equal probability. Once a hub (or any node) is covered, it cannot be
covered again. We conclude that Eqs. (3) and (4) are not equivalent for inhomogeneous
networks with topologies dominated by hubs with a large degree.

The biased sampling of the randomly chosen nodes is clearly demonstrated in Fig. 3b.
We find that the probability distribution of the mass $M_B$ of the boxes for a given $\ell_B$ is very
broad and can be approximated by a power-law: $P(M_B) \sim M_B^{-2.2}$ in the case of WWW
and $\ell_B = 4$. On the other hand, the probability distribution of $M_c$ is very narrow and can
be fitted by a log-normal distribution (see Fig. 3b). In the box covering method there are
many boxes with very large and very small masses in contrast to the peaked distribution
in the cluster growing method, thus showing the biased nature of the latter method in
inhomogeneous networks. This biased average leads to the exponential growth of the mass
in Eq. (6) and it also explains why the average distance is logarithmic with $N$ as in Eq. (1).

The box counting method provides a powerful tool for further investigations of the network
properties as it enables a renormalization procedure, revealing that the self-similar
properties and the scale-free degree distribution persist irrespective of the amount of
coarse-graining of the network.

Subsequent to the first step of assigning the nodes to the boxes we create a new renormalized
network by replacing each box by a single node. Two boxes are then connected,
provided that there was at least one link between their constituent nodes. The second
column of the panels in Fig. 1a shows this step in the renormalization procedure for the
schematic network, while Fig. 1b shows the results for the same procedure applied to the
entire WWW for $\ell_B = 3$.

The renormalized network gives rise to a new probability distribution of links, $P(k')$,
which is invariant under the renormalization:

$$P(k) \to P(k') \sim (k')^{-\gamma}. \qquad (7)$$

Figure 2d supports the validity of this scale transformation by showing a data collapse of
all distributions with the same $\gamma$ according to Eq. (7) for the WWW.

Further insight arises from relating the scale-invariant properties to the scale-free
degree distribution, Eq. (2). Plotting (see inset in Fig. 2d for the WWW) the number of links
$k'$ of each node in the renormalized network versus the maximum number of links $k$ in each
box of the unrenormalized network exhibits a scaling law

$$k \to k' = s(\ell_B)\,k. \qquad (8)$$

This equation defines the scaling transformation in the connectivity distribution. Empirically
we find that the scaling factor $s$ ($< 1$) scales with $\ell_B$ with a new exponent $d_k$ as

$$s(\ell_B) \sim \ell_B^{-d_k}, \qquad (9)$$

shown in Fig. 2a for the WWW and actor networks (with $d_k = 2.5$ and $d_k = 5.3$, respectively),
in Fig. 2b for the protein networks ($d_k = 2.1$ for E. coli and $d_k = 2.2$ for H. sapiens)
and in Fig. 2c for the cellular networks with $d_k = 3.2$.

Equations (8) and (9) shed light on how families of hierarchical sizes are linked together.
The larger the families, the fewer links exist. Surprisingly, the same power-law relation exists
for large and small families represented by Eq. (2).

From Eq. (7) we obtain $n(k)\,dk = n'(k')\,dk'$, where $n(k) = N P(k)$ is the number of
nodes with links $k$ and $n'(k') = N' P(k')$ is the number of nodes with links $k'$ after the
renormalization ($N'$ is the total number of nodes in the renormalized network). Using Eq.
(8) we obtain $n(k) = s^{1-\gamma} n'(k)$. Then, upon renormalizing a network with $N$ total nodes we
obtain a smaller number of nodes $N'$ according to $N' = s^{\gamma-1} N$. Since the total number of
nodes in the renormalized network is the number of boxes needed to cover the unrenormalized
network at any given $\ell_B$ we have $N' = N_B(\ell_B)$. Then, from Eqs. (3) and (9) we obtain the
relation between the three indexes

$$\gamma = 1 + d_B/d_k. \qquad (10)$$

Equation (10) is confirmed for all the networks analyzed here (see Supplementary Materials).
In all cases the calculation of $d_B$ and $d_k$ and Eq. (10) gives rise to the same $\gamma$
exponent as that obtained in the direct calculation of the degree distribution. The significance
of this result is that the scale-free properties characterized by $\gamma$ can be related to a
more fundamental length-scale invariant property, characterized by the two new indexes $d_B$
and $d_k$.
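
Equation (10) is easy to check numerically against the exponents quoted in the text. The sketch below only redoes that arithmetic; the dictionary layout and function name are illustrative, while the values of $d_B$, $d_k$ and $\gamma$ are the ones reported above for each network.

```python
def gamma_from_scaling(d_b, d_k):
    """Degree exponent predicted by Eq. (10): gamma = 1 + d_B / d_k."""
    return 1.0 + d_b / d_k


# (d_B, d_k, gamma measured directly from the degree distribution), as quoted in the text.
reported = {
    "WWW": (4.1, 2.5, 2.6),
    "Actor": (6.3, 5.3, 2.2),
    "E. coli (PIN)": (2.3, 2.1, 2.2),
    "H. sapiens (PIN)": (2.3, 2.2, 2.1),
    "Cellular networks": (3.5, 3.2, 2.2),
}

predicted = {
    name: round(gamma_from_scaling(d_b, d_k), 1)
    for name, (d_b, d_k, _gamma) in reported.items()
}

for name, (_d_b, _d_k, gamma) in reported.items():
    print(f"{name:18s} predicted {predicted[name]:.1f}  measured {gamma:.1f}")
```

The predicted and directly measured exponents agree to within about 0.1 for every network listed.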

Summarizing, we elucidate a fundamental property of a wide variety of complex networks:
that of a scale-invariant topology. Concepts first introduced for the study of critical phenomena
in statistical physics are shown to be valid here in the characterization of a different
class of phenomena: the topology of complex networks. One could envisage a great deal of
fundamental information being understood by the application of renormalization techniques
to this kind of complex system. For instance, networks with similar degree distributions
are characterized by different self-similar exponents, thus indicating that they may belong
to different "universality classes". It is as though each node (ranging from web-pages in
the WWW, to people in social networks, to proteins and substrates in cellular networks)
were connected to other nodes under a single self-organizing principle according to which
groups of nodes of all sizes self-organize too, such that everything links with everything else,
governed by one universal dynamics in Nature.

FIG. 1: The renormalization procedure applied to complex networks. a, Demonstration of the
method for different $\ell_B$ and different stages in a network demo. The first column depicts
the original network. We tile the system with boxes of size $\ell_B$ (different colors correspond
to different boxes). All nodes in a box are connected by a minimum distance smaller than
the given $\ell_B$. For instance, in the case of $\ell_B = 2$, we identify four boxes which contain
the nodes depicted with colors red, orange, white, and blue, each containing 3, 2, 1, and 2
nodes, respectively. Then we replace each box by a single node; two renormalized nodes are
connected if there is at least one link between the unrenormalized boxes. Thus we obtain the
network shown in the second column. The resulting number of boxes needed to tile the network,
$N_B(\ell_B)$, is plotted in Fig. 2 versus $\ell_B$ to obtain $d_B$ as in Eq. (3). The renormalization
procedure is applied again and repeated until the network is reduced to a single node (third
and fourth columns for different $\ell_B$). b, Three stages in the renormalization scheme applied
to the entire WWW. We fix the box size to $\ell_B = 3$ and apply the renormalization for four
stages. This corresponds, for instance, to the sequence for the network demo depicted in the
second row in part a of this figure. We color the nodes in the web according to the boxes to
which they belong. The network is invariant under this renormalization as explained in the
legend of Fig. 2d and the Supplementary Materials.

FIG. 2: Self-similar scaling in complex networks. a, Upper panel: Log-log plot of $N_B$
vs. $\ell_B$ revealing the self-similarity of the WWW and actor network according to Eq. (3).
Lower panel: The scaling of $s(\ell_B)$ vs. $\ell_B$ according to Eq. (9). The error bars are of
the order of the symbol size. b, Same as (a) but for two protein interaction networks: H.
sapiens and E. coli. Results are analogous to (a) but with different scaling exponents. c,
Same as (a) for the cellular networks of A. fulgidus, E. coli and C. elegans. d, Invariance of
the degree distribution of the WWW under the renormalization for different box sizes, $\ell_B$.
We show the data collapse of the degree distributions demonstrating the self-similarity at
different scales. The inset shows the scaling of $k' = s(\ell_B)\,k$ for different $\ell_B$, from where we
obtain the scaling factor $s(\ell_B)$. Moreover, we also apply the renormalization for a fixed box
size, for instance $\ell_B = 3$ as shown in Fig. 1b for the WWW, until the network is reduced to
a few nodes and find that $P(k)$ is invariant under these multiple renormalizations as well,
for several iterations (see Supplementary Materials).

FIG. 3: Different averaging techniques lead to qualitatively different results. a, Mean
value of the box mass in the box counting method, $\langle M_B \rangle$, and the cluster mass in the cluster
growing method, $\langle M_c \rangle$, for the WWW. The solid lines represent the power-law fit for $\langle M_B \rangle$
and the exponential fit for $\langle M_c \rangle$ according to Eqs. (5) and (6), respectively. b, Probability
distribution of $M_B$ and $M_c$ for $\ell_B = 4$ for the WWW. The curves are fitted by a power-law
and a log-normal distribution, respectively.

SUPPLEMENTARY MATERIALS



THE BOX COVERING METHOD 

Since the box covering method is central to the understanding of the scale-invariant
properties of networks, we describe it in more detail here. Figure 4a shows the same network
as in Fig. 1a for the case $\ell_B = 2$. We tile the system by first assigning nodes 1 and 2 to
the box colored in blue. Notice that the maximum distance between the nodes of a given
box is $\ell_B - 1$. Thus, node 8 would not be in the blue box since its distance from node 2 is
$\ell = 2$ (even though its distance from 1 is $\ell = 1$). Then we cover the nodes 6 and 7 with the
orange box, and the nodes 3, 4, and 5 with the red box. Finally, the last node 8 is assigned
to the green box. The number of boxes to cover the network is then $N_B = 4$.

The renormalization is then applied by replacing each box by a single node. Thus, nodes
1 and 2 will be combined into a single node as indicated by the arrow from the first panel
to the second panel in the figure. This renormalized node is connected with the orange and
green boxes because there is a link between nodes 2 and 7, and 1 and 8, respectively. The 
same rule applies to the other boxes. The renormalized network is shown in the second 
panel. The system is then tiled again with boxes; in this case two boxes (blue and red) are 
needed to cover the entire network. The two boxes are then replaced by nodes and a second 
renormalized network is obtained as shown in the third panel. Finally, the last two nodes 
belong to the same (red) box and are replaced by a single node. 
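
The replacement of boxes by nodes can be sketched as follows. The box assignment and the links 1-2, 1-8 and 2-7 follow the description above; the remaining links of the 8-node network are not listed in the text, so the edge list used here is a hypothetical one chosen to be consistent with the stated boxes. The function name is illustrative.

```python
def renormalize(adj, boxes):
    """Replace each box by one node; link two boxes if any of their nodes are linked."""
    box_of = {node: index for index, box in enumerate(boxes) for node in box}
    new_adj = {index: set() for index in range(len(boxes))}
    for u, neighbours in adj.items():
        for v in neighbours:
            bu, bv = box_of[u], box_of[v]
            if bu != bv:  # links inside a box disappear
                new_adj[bu].add(bv)
                new_adj[bv].add(bu)
    return new_adj


# Hypothetical edge list for the 8-node demo network (only 1-2, 1-8, 2-7 are from the text).
edges = [(1, 2), (1, 8), (2, 7), (6, 7), (4, 7), (3, 4), (4, 5), (3, 5)]
adj = {node: set() for node in range(1, 9)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

# Boxes for box size 2: blue, orange, red, green (indices 0, 1, 2, 3).
boxes = [[1, 2], [6, 7], [3, 4, 5], [8]]
renormalized = renormalize(adj, boxes)
```

Here the blue box (index 0) ends up linked to the orange and green boxes, as described in the text; applying a box covering and `renormalize` repeatedly reduces the network to a single node.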

This procedure is applied to the WWW in Fig. 1b. The main panel corresponds to the
first stage in the renormalization of the web for $\ell_B = 3$. The procedure is applied again
obtaining the remaining panels in Fig. 1b until the web is reduced to a single box in the
last panel. The colors of the nodes correspond to the boxes to which they belong.

In Fig. 2d we show the invariance of the degree distribution $P(k)$ under the renormalization
performed as a function of the box size in the WWW. The other networks analyzed in
this study present the same invariant property. It is important to mention that the networks
are also invariant under multiple renormalizations applied for a fixed box size $\ell_B$. This
corresponds, for instance, to the stages depicted in Fig. 1a in the second row for $\ell_B = 3$ for the
network demo. Figure 5 shows the invariance of $P(k)$ for the WWW after several stages of
the renormalization for a fixed $\ell_B = 3$, and it is the analogue of Fig. 2d for different box
sizes. The stages 1, 2, and 3 correspond to the networks depicted in the first 3 stages in Fig.
1b.

FIG. 4: Details of the box covering method for a, $\ell_B = 2$. b, A different covering for the same
network as in (a) for $\ell_B = 2$. Different coverings give rise to the same exponents as explained in
the text.

FIG. 5: Invariance of the degree distribution of the WWW under multiple renormalizations done
at fixed $\ell_B = 3$. The stages 1, 2, and 3 correspond to the networks depicted in the first three stages
in Fig. 1b.

From the above explanation it should be clear that there are many ways to tile the
network. For instance, in Fig. 4b we show another tiling. In this case we assign nodes 4
and 7 together in a single box instead of nodes 6 and 7 as in Fig. 4a. This tiling results in
an extra box needed to cover node 6 and therefore in a larger number of boxes to tile the
system, $N_B = 5$.

While there are many ways to assign nodes to the boxes, we notice that the rigorous
mathematical definition of Eq. (3) corresponds to the minimum number of boxes needed to
cover the network. This minimization does not have any consequence for the determination
of the fractal dimension in homogeneous clusters. However, it may become relevant
when calculating the self-similar exponent of a complex network with a widely distributed
number of links. Finding the minimum number of boxes to cover the network is a hard
optimization problem to solve, analogous to the graph coloring problem in the NP-complete
complexity class. This minimization problem has to be solved by an exhaustive numerical
search since no efficient exact algorithm is known for this kind of problem.

We have performed the search over a limited part of the phase-space for the WWW to
obtain an estimation of the average and the minimum number of boxes needed to tile the
network for every value of $\ell_B$. We find that the average number of boxes is very close to
the estimated minimum number of boxes. Moreover, we find that the minimization is not
relevant and any covering gives rise to the same exponent.



SCALE-FREE TREE STRUCTURE 



The underlying meaning of the existence of scale-free networks which are self-similar is 
yet to be deciphered, but some insight can be gained by examining the simplest structure 
of a known network of that kind: a tree network which has been characterized using field 

The sequence of renormalization steps depicted in Fig. 1 suggests the following scheme:
one begins with a single node and then constructs the network by applying the renormalization
procedure in a reversed fashion. This can be achieved by following the procedure in
Fig. 1 for a specific value of $\ell_B$.

More specifically, a single node with a large number of links is first connected to the
next generation of nodes. For every node we assign a number of links from a power-law
distribution with a given $\gamma$. The next layer of the tree is generated in the same way. A tree
structure with a power-law degree distribution and self-similar topology emerges which is
depicted in Fig. 6a.
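
The layer-by-layer construction can be sketched as below. This is a minimal illustration under several assumptions not fixed by the text: degrees are drawn from a discrete power law truncated at `k_max`, the minimum degree is 1 (so many nodes are leaves), and growth stops after a fixed number of generations. All names and parameter values are hypothetical.

```python
import random


def grow_scale_free_tree(root_degree, gamma, generations, seed=0, k_max=1000):
    """Grow a tree layer by layer; each new node draws its degree from P(k) ~ k^-gamma."""
    rng = random.Random(seed)
    degrees = list(range(1, k_max + 1))
    weights = [k ** -gamma for k in degrees]
    edges = []
    next_id = 1
    frontier = []
    # The root (node 0) starts with a large number of links.
    for _ in range(root_degree):
        edges.append((0, next_id))
        frontier.append(next_id)
        next_id += 1
    for _ in range(generations - 1):
        new_frontier = []
        for parent in frontier:
            k = rng.choices(degrees, weights=weights)[0]
            # One of the k links is already used by the connection to the parent.
            for _ in range(k - 1):
                edges.append((parent, next_id))
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return edges


tree_edges = grow_scale_free_tree(root_degree=50, gamma=2.3, generations=4, seed=1)
tree_nodes = {node for edge in tree_edges for node in edge}
```

Because every new link introduces exactly one new node, the result has no loops by construction.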

FIG. 6: The scale-free tree structure and the random scale-free model. a, Example of a scale-free
tree structure. Nodes with a power-law degree distribution are connected in a tree structure
without loops. b, The log-log plot of $N_B$ vs $\ell_B$ reveals a self-similar structure for the scale-free
tree (upper panel) while $s(\ell_B)$ scales as in Eq. (9) (lower panel). In contrast, the random scale-free
network where nodes (with a power-law distribution of links) are connected at random shows a
lack of self-similarity expressed in the exponential decrease with $\ell_B$ in the upper panel.

This is corroborated numerically in Fig. 6b where we study a scale-free tree structure
with 192,827 nodes and $\gamma = 2.3$, and we find $d_B = 3.4$ and $d_k = 2.5$. The parallels between
the features of such a simple structured network and those discussed in this paper suggest
that this simplified view may lie at the core of more complex self-similar networks.

Moreover, we also calculate the average mass of the boxes and the mass of the clusters
in the box covering method and the cluster growing method, respectively, and we find the
power law of Eq. (5) and the exponential behaviour of Eq. (6) (see Fig. 7a) in agreement
with the results of the real networks analyzed in the main manuscript (Fig. 3a). Figure 7b
shows the probability distribution of $M_B$ (power-law) and $M_c$ (log-normal) in agreement
with previous results as well (Fig. 3b).

FIG. 7: Results for the scale-free tree model. a, Mean value of the box mass in the box counting
method, $\langle M_B \rangle$, and mean value of the cluster mass in the cluster growing method, $\langle M_c \rangle$, versus
$\ell_B$. b, Probability distribution of $M_B$ and $M_c$ for $\ell_B = 5$. The results are in agreement with the
findings for real networks in Fig. 3. A power-law distribution is found for $M_B$ while a log-normal
distribution is found for $M_c$ as shown by the fits.

INTERNET

It is interesting to note that not all complex networks show the clear self-similarity of
the networks presented so far. We analyze the Internet composed of computers and routers
linked by physical lines such as the database collected by the SCAN project (the "Mbone",
www.isi.edu/scan/scan.html; we also analyze the database of the Internet Mapping Project
and found similar results). Figure 8 shows the result of $N_B(\ell_B)$. We fit the curve with
a modified power-law

$$N_B(\ell_B) \sim (\ell_B + \ell_c)^{-d_B}, \qquad (11)$$

with $\ell_c = 14.9$ representing a cut-off and $d_B = 8.5$, suggesting a large self-similar exponent.
The decay of $N_B$ with $\ell_B$ is faster than a power-law and slower than exponential as shown
in the inset of Fig. 8.

Thus these networks lack the clear self-similar structure found for the WWW, actors and
the biological networks. However, we find that the distribution $P(M_B)$ remains a power
law and the degree distribution $P(k)$ is invariant under the renormalization, suggesting that
some self-similar properties might still be valid for the Internet. We notice that Internet
maps are made by programs that use the IP protocol to trace the connections between each
registered node in the Internet. These maps are incomplete since they map a few routers
from each domain and also due to the existence of firewalls. Thus, the apparent lack of
self-similarity might be due to incomplete information of the network.

FIG. 8: Internet. Log-log plot of $N_B(\ell_B)$. The solid line represents the modified power law fit, Eq.
(11). The inset shows a linear-log plot indicating that the decay is slower than exponential.

PROTEIN-PROTEIN INTERACTION NETWORKS 

We also analyze the protein interaction networks of the fruit fly D. melanogaster as given
in [23], the bacterium H. pylori [24], the baker's yeast S. cerevisiae [25], and the nematode
worm C. elegans [26], which are all available via the DIP database. Figure 9 shows
the results of $N_B$ versus $\ell_B$, indicating that their behaviour is in between a pure power-law
decay and a pure exponential. As with the Internet data, we are able to fit the results with
Eq. (11) with $\ell_c = 7.2$ and $d_B = 7.6$ for C. elegans. For H. pylori and D. melanogaster
the fit is a pure exponential $N_B(\ell_B) \sim \exp(-\ell_B/\ell_0)$ with $\ell_0 \approx 1$, while for S. cerevisiae the
data could be fitted either by an exponential or by large values of $\ell_c$ and $d_B$ (note that the
exponential is the limit of Eq. (11) for $\ell_c \to \infty$, $d_B \to \infty$ and $\ell_c/d_B$ = constant). On the
other hand, we observe that for small scales, $N_B$ seems to display the same power law found
for E. coli and H. sapiens. The lack of clear self-similarity in these networks might be due
to the incompleteness of these databases which are continuously being updated with newly
discovered physical interactions.
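
The limiting statement in the parenthesis above can be verified in one step. Writing the fixed ratio as $\ell_0 \equiv \ell_c/d_B$ (a symbol assumed here so that it matches the exponential fit), the modified power law of Eq. (11) becomes:

```latex
N_B(\ell_B) \sim (\ell_B + \ell_c)^{-d_B}
  = \ell_c^{-d_B} \left( 1 + \frac{\ell_B}{\ell_c} \right)^{-\ell_c/\ell_0},
\qquad
\lim_{\ell_c \to \infty} \left( 1 + \frac{\ell_B}{\ell_c} \right)^{-\ell_c/\ell_0}
  = e^{-\ell_B/\ell_0}.
```

The prefactor $\ell_c^{-d_B}$ does not depend on $\ell_B$ and is absorbed into the normalization, so only the exponential dependence on $\ell_B$ survives in the limit.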

FIG. 9: Scaling for the protein-protein interaction networks. Log-log plot of $N_B$ versus $\ell_B$ for
different protein-protein interaction networks. While E. coli and H. sapiens show a clear power law
behavior, the other protein networks show a modified power-law behaviour or a pure exponential
decay. The inset shows a linear-log plot of $N_B(\ell_B)$.

RANDOM SCALE-FREE NETWORK 

Next we introduce an example of a model lacking self-similarity: the random scale-free
model. This model consists of nodes to which a number of links are assigned with a power-law
degree distribution and then connected randomly. Such a network shows a small world
effect and a scale-free property but is not self-similar. We numerically find that the number
of boxes decays exponentially with the box size (see Fig. 6b). Moreover, while Eq. (8) is
still valid in this case, the power law relation in Eq. (9) is replaced by an exponential law.
We conjecture that the reason for this is a clustering of hubs; by assigning randomly the
connections between the nodes, two nodes with a large number of links will have a large
probability to be connected. This induces spatial correlations in the values of $k$ which may
explain the breakdown of self-similarity. In contrast, the simple tree-structure proposed
above does not cluster the hubs by construction. A summary of our results is presented in
Table I.
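
A random scale-free network of this kind can be sketched with a stub-matching construction (often called the configuration model). This is an assumed implementation of "assign degrees, then connect randomly", not necessarily the one used for the figures; the degree cut-off `k_max`, the handling of self-loops and duplicate links, and all names are illustrative choices.

```python
import random


def random_scale_free_edges(n_nodes, gamma, seed=0, k_max=100):
    """Assign power-law degrees, then pair link ends ("stubs") uniformly at random."""
    rng = random.Random(seed)
    degrees = list(range(1, k_max + 1))
    weights = [k ** -gamma for k in degrees]
    assigned = rng.choices(degrees, weights=weights, k=n_nodes)
    # One stub per link end; node i appears assigned[i] times.
    stubs = [node for node, k in enumerate(assigned) for _ in range(k)]
    if len(stubs) % 2:
        stubs.pop()  # drop one stub so that all stubs can be paired
    rng.shuffle(stubs)
    edges = set()
    for u, v in zip(stubs[::2], stubs[1::2]):
        if u != v:  # discard self-loops; the set discards duplicate links
            edges.add((min(u, v), max(u, v)))
    return edges


random_edges = random_scale_free_edges(n_nodes=200, gamma=2.5, seed=1)
```

Unlike the tree construction, nothing here prevents two high-degree nodes from being linked directly, which is the hub clustering conjectured above to break self-similarity.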

FIG. 10: Barabasi-Albert model of scale-free networks with preferential attachment for 150,000
nodes and $m = m_0 = 3$ and $m = m_0 = 5$. $m_0$ is the initial number of nodes in the system and $m$ is
the number of links of a newly created node in the dynamical growth of the network. Log-log
plot of $N_B$ versus $\ell_B$ showing the lack of a power law behaviour. The inset shows a linear-log plot
indicating that $N_B$ decreases faster than exponential with $\ell_B$.

THE BARABASI-ALBERT MODEL AND THE ERDOS-RENYI RANDOM 
GRAPH AT CRITICALITY 

We also analyzed the Barabasi-Albert model of complex networks (which introduces
the concept of preferential attachment to describe the dynamics of scale-free networks).
The results of $N_B(\ell_B)$ are shown in Fig. 10 for different parameters in the model (see [15]
for details), revealing that the structure is not self-similar; $N_B$ seems to decrease faster than
exponential with $\ell_B$.

It is interesting to compare our results with the random Erdos-Renyi graph at the
critical percolation threshold. In this case the largest cluster has self-similar properties and
Eq. (5), $\langle M_B(\ell_B) \rangle \sim \ell_B^{d_B}$, is valid with $d_B = 2$ [27]. We corroborate this result in Fig. 11
showing the scaling of the number of boxes $N_B$ with the box size $\ell_B$. However, for this case
the network is not small-world since Eq. (6) is not valid (and neither is Eq. (1)); rather,
the mean distance $\ell$ scales as $\langle M_c \rangle^{1/2}$, i.e., a power-law relation rather than the logarithmic
relation characteristic of small world networks.

FIG. 11: Erdos-Renyi random graph at criticality. Log-log plot of $N_B$ versus $\ell_B$ showing the
self-similar exponent $d_B = 2$ which is obtained for large distances.

| Network | $d_B$ | $d_k$ | $1 + d_B/d_k$, Eq. (10) | $\gamma$, Eq. (2) |
|---|---|---|---|---|
| WWW | 4.1 | 2.5 | 2.6 | 2.6 |
| Actor | 6.3 | 5.3 | 2.2 | 2.2 |
| E. coli (PIN) | 2.3 | 2.1 | 2.1 | 2.2 |
| H. sapiens (PIN) | 2.3 | 2.2 | 2.0 | 2.1 |
| 43 cellular networks | 3.5 | 3.2 | 2.1 | 2.2 |
| Scale-free tree | 3.4 | 2.5 | 2.4 | 2.3 |

TABLE I: Summary of the exponents obtained for the scale-invariant networks studied in the
manuscript.

CELLULAR NETWORKS

The WIT database (http://igweb.integratedgenomics.com/IGwit) of cellular networks
considers the cellular functions divided according to bioengineering principles, containing
datasets for intermediate metabolism and bioenergetics (core metabolism), information
pathways, electron transport, and transmembrane transport. The metabolic network is a
subset of all reactions that take place in the cell. Since this is the largest part of the network
we analyze it separately and compare it with the full biochemical reaction network. The
data presented in Fig. 2c represent the full biochemical reaction networks of only three
species. Here we present results for the 43 different species represented in the database for
the metabolic and full networks. The following figures show the results of $N_B$ vs $\ell_B$. Both
the metabolic and full networks display the power law relationship of self-similar networks
with the same exponent (within error bars) for all the organisms considered (the metabolic
networks show a finite size effect due to their smaller size). We find an average $d_B = 3.5$.
The solid line in the figures represents the average fit. The values are reported in the
accompanying table.